2,331 research outputs found

    Automated census record linking: a machine learning approach

    Thanks to the availability of new historical census sources and advances in record linking technology, economic historians are becoming big data genealogists. Linking individuals over time and between databases has opened up new avenues for research into intergenerational mobility, assimilation, discrimination, and the returns to education. To take advantage of these new research opportunities, scholars need to be able to accurately and efficiently match historical records and produce an unbiased dataset of links for downstream analysis. I detail a standard and transparent census matching technique for constructing linked samples that can be replicated across a variety of cases. The procedure applies insights from machine learning classification and text comparison to the well-known problem of record linkage, but with a focus on the sorts of costs and benefits of working with historical data. I begin by extracting a subset of possible matches for each record, and then use training data to tune a matching algorithm that attempts to minimize both false positives and false negatives, taking into account the inherent noise in historical records. To make the procedure precise, I trace its application to an example from my own work, linking children from the 1915 Iowa State Census to their adult selves in the 1940 Federal Census. In addition, I provide guidance on a number of practical questions, including how large the training data needs to be relative to the sample. This research has been supported by the NSF-IGERT Multidisciplinary Program in Inequality & Social Policy at Harvard University (Grant No. 0333403).

    On the equality of Hausdorff and box counting dimensions

    By viewing the covers of a fractal as a statistical mechanical system, the exact capacity of a multifractal is computed. The procedure can be extended to any multifractal described by a scaling function to show why the capacity and Hausdorff dimension are expected to be equal. Comment: CYCLER Paper 93mar001 LaTeX file with 3 PostScript figures (needs psfig.sty)

    How legislators respond to localized economic shocks: evidence from Chinese import competition

    We explore the effects of localized economic shocks from trade on roll-call behavior and electoral outcomes in the US House, 1990–2010. We demonstrate that economic shocks from Chinese import competition, first studied by Autor, Dorn, and Hanson, cause legislators to vote in a more protectionist direction on trade bills but cause no change in their voting on all other bills. At the same time, these shocks have no effect on the reelection rates of incumbents, the probability an incumbent faces a primary challenge, or the partisan control of the district. Though changes in economic conditions are likely to cause electoral turnover in many cases, incumbents exposed to negative economic shocks from trade appear able to fend off these effects in equilibrium by taking strategic positions on foreign-trade bills. In line with this view, we find that the effect on roll-call voting is strongest in districts where incumbents are most threatened electorally. Taken together, these results paint a picture of responsive incumbents who tailor their roll-call positions on trade bills to the economic conditions in their districts.

    Capital Destruction and Economic Growth: The Effects of Sherman’s March, 1850-1920

    Working paper. Using General William Sherman’s 1864–65 military march through Georgia, South Carolina, and North Carolina during the American Civil War, this paper studies the effect of capital destruction on short- and long-run local economic activity, and the role of financial markets in the recovery process. We match an 1865 US War Department map of Sherman’s march to county-level demographic, agricultural, and manufacturing data from the 1850–1920 US Censuses. We show that the capital destruction induced by the March led to a large contraction in agricultural investment, farming asset prices, and manufacturing activity. Elements of the decline in agriculture persisted through 1920. Using information on local banks and access to credit, we argue that the underdevelopment of financial markets played a role in weakening the recovery.

    The majority-party disadvantage: revising theories of legislative organization

    Dominant theories of legislative organization in the U.S. rest on the notion that the majority party arranges legislative matters to enhance its electoral fortunes. Yet, we find little evidence for a short-term electoral advantage for the majority party in U.S. state legislatures. Furthermore, there appears to be a pronounced downstream majority-party disadvantage. To establish these findings, we propose a technique for aggregating the results of close elections to obtain as-if random variation in majority-party status. We argue that the results from this approach are consistent with a phenomenon of inter-temporal balancing, which we link to other forms of partisan balancing in U.S. elections. The article thus necessitates revisions to our theories of legislative organization, offers new arguments for balancing theories, and lays out an empirical technique for studying the effects of majority-party status in legislative contexts.

    Semi-Streaming Set Cover

    This paper studies the set cover problem under the semi-streaming model. The underlying set system is formalized in terms of a hypergraph $G = (V, E)$ whose edges arrive one-by-one and the goal is to construct an edge cover $F \subseteq E$ with the objective of minimizing the cardinality (or cost in the weighted case) of $F$. We consider a parameterized relaxation of this problem, where given some $0 \leq \epsilon < 1$, the goal is to construct an edge $(1 - \epsilon)$-cover, namely, a subset of edges incident to all but an $\epsilon$-fraction of the vertices (or their benefit in the weighted case). The key limitation imposed on the algorithm is that its space is limited to (poly)logarithmically many bits per vertex. Our main result is an asymptotically tight trade-off between $\epsilon$ and the approximation ratio: We design a semi-streaming algorithm that on input graph $G$, constructs a succinct data structure $\mathcal{D}$ such that for every $0 \leq \epsilon < 1$, an edge $(1 - \epsilon)$-cover that approximates the optimal edge $(1 - \epsilon)$-cover within a factor of $f(\epsilon, n)$ can be extracted from $\mathcal{D}$ (efficiently and with no additional space requirements), where $f(\epsilon, n) = \left\{ \begin{array}{ll} O(1 / \epsilon), & \text{if } \epsilon > 1 / \sqrt{n} \\ O(\sqrt{n}), & \text{otherwise} \end{array} \right.$ In particular for the traditional set cover problem we obtain an $O(\sqrt{n})$-approximation. This algorithm is proved to be best possible by establishing a family (parameterized by $\epsilon$) of matching lower bounds. Comment: Full version of the extended abstract that will appear in Proceedings of ICALP 2014 track

    A Time-Space Tradeoff for Triangulations of Points in the Plane

    In this paper, we consider time-space trade-offs for reporting a triangulation of points in the plane. The goal is to minimize the amount of working space while keeping the total running time small. We present the first multi-pass algorithm for the problem that returns the edges of a triangulation with their adjacency information. This even improves on the previously best-known random-access algorithm.

    Ultrasmall volume Plasmons - yet with complete retardation effects

    Nanoparticle plasmons are attributed to quasi-static oscillation with no wave propagation due to their subwavelength size. However, when located within a band-gap medium (even in air if the particle is small enough), the particle interfaces act as wave-mirrors, incurring small negative retardation. The latter, when compensated by a respective (short) propagation within the particle, substantiates a full-fledged resonator based on constructive interference. This unusual wave interference in the deep subwavelength regime (modal volume $< 0.001\lambda^3$) significantly enhances the Q-factor, e.g. 50 compared to the quasi-static limit of 5.5. Comment: 16 pages, 6 figures

    Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem

    In this paper, we study linear programming based approaches to the maximum matching problem in the semi-streaming model. The semi-streaming model has gained attention as a model for processing massive graphs as the importance of such graphs has increased. This is a model where edges are streamed in an adversarial order and we are allowed space proportional to the number of vertices in a graph. In recent years, there have been several new results in this semi-streaming model. However, broad techniques such as linear programming have not been adapted to this model. We present several techniques to adapt and optimize linear programming based approaches in the semi-streaming model with an application to the maximum matching problem. As a consequence, we improve (almost) all previous results on this problem, and also prove new results on interesting variants.